US Election Tweets Sentiment Analysis

This notebook analyzes the sentiment of tweets about the two presidential candidates, Joe Biden and Donald Trump, to see whether any interesting patterns emerge.


Analysis Steps

  1. Clean the data
  2. Filter the data (select only English-language tweets from the United States)
  3. Evaluate sentiment score for each candidate
  4. Break down sentiment score for each state
  5. Visualize

Import modules

In [5]:
import numpy as np 
import pandas as pd

Reading the File

Read the Joe Biden data

In [6]:
joe_biden = pd.read_csv(r'hashtag_joebiden.csv')
joe_biden.head()
/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/IPython/core/interactiveshell.py:3071: DtypeWarning: Columns (1,2,3,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20) have mixed types.Specify dtype option on import or set low_memory=False.
  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
Out[6]:
created_at tweet_id tweet likes retweet_count source user_id user_name user_screen_name user_description ... user_followers_count user_location lat long city country continent state state_code collected_at
0 2020-10-15 00:00:01 1.316529221557252e+18 #Elecciones2020 | En #Florida: #JoeBiden dice ... 0.0 0.0 TweetDeck 360666534.0 El Sol Latino News elsollatinonews 🌐 Noticias de interés para latinos de la costa... ... 1860.0 Philadelphia, PA / Miami, FL 25.7743 -80.1937 NaN United States of America North America Florida FL 2020-10-21 00:00:00
1 2020-10-15 00:00:18 1.31652929585929e+18 #HunterBiden #HunterBidenEmails #JoeBiden #Joe... 0.0 0.0 Twitter for iPad 809904438.0 Cheri A. 🇺🇸 Biloximeemaw Locked and loaded Meemaw. Love God, my family ... ... 6628.0 NaN NaN NaN NaN NaN NaN NaN NaN 2020-10-21 00:00:00.517827283
2 2020-10-15 00:00:20 1.3165293050069524e+18 @IslandGirlPRV @BradBeauregardJ @MeidasTouch T... 0.0 0.0 Twitter Web App 3494182277.0 Flag Waver Flag_Wavers NaN ... 1536.0 Golden Valley Arizona 46.304 -109.171 NaN United States of America North America Montana MT 2020-10-21 00:00:01.035654566
3 2020-10-15 00:00:21 1.3165293080815575e+18 @chrislongview Watching and setting dvr. Let’s... 0.0 0.0 Twitter for iPhone 8.242596012018524e+17 Michelle Ferg MichelleFerg4 NaN ... 27.0 NaN NaN NaN NaN NaN NaN NaN NaN 2020-10-21 00:00:01.553481849
4 2020-10-15 00:00:22 1.316529312741253e+18 #censorship #HunterBiden #Biden #BidenEmails #... 1.0 0.0 Twitter Web App 1.032806955356545e+18 the Gold State theegoldstate A Silicon Valley #independent #News #Media #St... ... 390.0 California, USA 36.7015 -118.756 NaN United States of America North America California CA 2020-10-21 00:00:02.071309132

5 rows × 21 columns
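
The DtypeWarning above appears because pandas infers column types chunk by chunk and finds conflicting types in the same column. A minimal sketch of one way to avoid it follows. The helper name `read_tweets`, the choice to keep the ID columns as text, and the sample rows are assumptions for illustration, not part of the original notebook.

```python
import io
import pandas as pd

def read_tweets(source):
    """Read a tweet CSV without chunked type inference (hypothetical helper)."""
    return pd.read_csv(
        source,
        # Assumption: IDs are identifiers, not quantities, so keep them as
        # text to avoid the float rounding visible in the output above.
        dtype={"tweet_id": str, "user_id": str},
        low_memory=False,  # infer each column's type from the whole file
    )

# Made-up rows standing in for hashtag_joebiden.csv
sample = io.StringIO(
    "created_at,tweet_id,tweet,user_id\n"
    "2020-10-15 00:00:01,1000000000000000001,#JoeBiden hello,42\n"
    "2020-10-15 00:00:18,1000000000000000002,,43\n"
)
tweets = read_tweets(sample)
print(tweets.dtypes)
```

In the notebook this would be called as `read_tweets('hashtag_joebiden.csv')`; whether string IDs suit the rest of the analysis is a judgment call.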

Getting the Donald Trump Data

In [9]:
donard_trump = pd.read_csv('hashtag_donaldtrump.csv', lineterminator='\n', parse_dates=True) 
donard_trump.head()
Out[9]:
created_at tweet_id tweet likes retweet_count source user_id user_name user_screen_name user_description ... user_followers_count user_location lat long city country continent state state_code collected_at
0 2020-10-15 00:00:01 1.316529e+18 #Elecciones2020 | En #Florida: #JoeBiden dice ... 0.0 0.0 TweetDeck 3.606665e+08 El Sol Latino News elsollatinonews 🌐 Noticias de interés para latinos de la costa... ... 1860.0 Philadelphia, PA / Miami, FL 25.774270 -80.193660 NaN United States of America North America Florida FL 2020-10-21 00:00:00
1 2020-10-15 00:00:01 1.316529e+18 Usa 2020, Trump contro Facebook e Twitter: cop... 26.0 9.0 Social Mediaset 3.316176e+08 Tgcom24 MediasetTgcom24 Profilo ufficiale di Tgcom24: tutte le notizie... ... 1067661.0 NaN NaN NaN NaN NaN NaN NaN NaN 2020-10-21 00:00:00.373216530
2 2020-10-15 00:00:02 1.316529e+18 #Trump: As a student I used to hear for years,... 2.0 1.0 Twitter Web App 8.436472e+06 snarke snarke Will mock for food! Freelance writer, blogger,... ... 1185.0 Portland 45.520247 -122.674195 Portland United States of America North America Oregon OR 2020-10-21 00:00:00.746433060
3 2020-10-15 00:00:02 1.316529e+18 2 hours since last tweet from #Trump! Maybe he... 0.0 0.0 Trumpytweeter 8.283556e+17 Trumpytweeter trumpytweeter If he doesn't tweet for some time, should we b... ... 32.0 NaN NaN NaN NaN NaN NaN NaN NaN 2020-10-21 00:00:01.119649591
4 2020-10-15 00:00:08 1.316529e+18 You get a tie! And you get a tie! #Trump ‘s ra... 4.0 3.0 Twitter for iPhone 4.741380e+07 Rana Abtar - رنا أبتر Ranaabtar Washington Correspondent, Lebanese-American ,c... ... 5393.0 Washington DC 38.894992 -77.036558 Washington United States of America North America District of Columbia DC 2020-10-21 00:00:01.492866121

5 rows × 21 columns

Cleaning the Data after we import it

Cleaning Joe Biden Tweets

  1. remove empty values
  2. remove blank rows
In [4]:
joe_biden = joe_biden[joe_biden['tweet'].notna()].copy()
# replace every non-alphanumeric character with a space
joe_biden['tweet'] = joe_biden['tweet'].str.replace(r"[^A-Za-z0-9]", ' ', regex=True)
# remove rows that have a null value in any column
joe_biden.dropna(inplace=True)
blanks = []  # start with an empty list

for row in joe_biden.itertuples():  # iterate over the DataFrame
    if type(row.tweet)==str:         # avoid NaN values
        if row.tweet.isspace():      # check for whitespace-only tweets
            blanks.append(row.Index)     # add matching index labels to the list

joe_biden.drop(blanks, inplace=True)
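
The loop above can also be written as a single vectorized filter. The following is a sketch, not the notebook's own code: `drop_blank_tweets` is a hypothetical helper, and unlike `str.isspace()` it also drops completely empty strings.

```python
import pandas as pd

def drop_blank_tweets(df, column="tweet"):
    """Return a copy of df without rows whose text is missing or blank."""
    text = df[column]
    # Keep rows that are not null and still contain something after stripping
    keep = text.notna() & text.astype(str).str.strip().ne("")
    return df[keep].copy()

# Made-up sample: one real tweet, one whitespace-only, one missing, one empty
sample = pd.DataFrame({"tweet": ["Vote today", "   ", None, ""]})
cleaned = drop_blank_tweets(sample)
print(cleaned)
```

Returning a copy avoids pandas' chained-assignment warnings when new columns are added later.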

Cleaning Donald Trump Tweets

In [5]:
donard_trump = donard_trump[donard_trump['tweet'].notna()].copy()
# replace every non-alphanumeric character with a space
donard_trump['tweet'] = donard_trump['tweet'].str.replace(r"[^A-Za-z0-9]", ' ', regex=True)
# remove rows that have a null value in any column
donard_trump.dropna(inplace=True)
blanks = []  # start with an empty list

for row in donard_trump.itertuples():  # iterate over the DataFrame
    if type(row.tweet)==str:         # avoid NaN values
        if row.tweet.isspace():      # check for whitespace-only tweets
            blanks.append(row.Index)     # add matching index labels to the list

donard_trump.drop(blanks, inplace=True)

We are only interested in the United States, so we keep only the tweets located there

In [6]:
joe_biden=joe_biden[joe_biden.country == 'United States of America']
donard_trump = donard_trump[donard_trump.country == 'United States of America']

Select only the English tweets

This is a two-step process

  1. First, detect whether each tweet is in English or another language
  2. Then, filter the data based on the language label
In [7]:
!pip install langdetect
Collecting langdetect
  Downloading langdetect-1.0.8.tar.gz (981 kB)
     |████████████████████████████████| 981 kB 909 kB/s eta 0:00:01
Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from langdetect) (1.14.0)
Building wheels for collected packages: langdetect
  Building wheel for langdetect (setup.py) ... done
  Created wheel for langdetect: filename=langdetect-1.0.8-py3-none-any.whl size=993191 sha256=d90045763115b3ffcfab04afd0a4a37a51b7b4c7fcb5bbd9c2b014fa089f0ee1
  Stored in directory: /root/.cache/pip/wheels/59/f6/9d/85068904dba861c0b9af74e286265a08da438748ee5ae56067
Successfully built langdetect
Installing collected packages: langdetect
Successfully installed langdetect-1.0.8

Language detection is a very lengthy operation, so I exported the results to a CSV file

In [8]:
# from langdetect import detect
# from langdetect.lang_detect_exception import LangDetectException
# joe_biden['lang'] = joe_biden['tweet'].apply(detect)
# joe_biden[['tweet','lang']].head()
# joe_biden = joe_biden[joe_biden['lang']== 'en']

# langs = []
# for row in donard_trump.itertuples():
#     try:
#         lang = detect(str(row[3]))      # row[3] is the tweet text
#         print((row.Index,lang))
#         langs.append((row.Index,lang))
#     except LangDetectException:         # raised when no language can be detected
#         langs.append((row.Index,'Error'))
In [9]:
# langs_trump_df = pd.DataFrame(langs,columns=['index','lang']).set_index(['index'])
# langs_trump_merge = donard_trump.merge(langs_trump_df, left_index=True, right_index=True)
# langs_trump_merge = langs_trump_merge[langs_trump_merge['lang']== 'en']
# langs_trump_merge.to_csv('donard_trump.csv')
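
The two steps (label each tweet, then filter on the label) can be sketched as one small helper. This is an illustration under assumptions: `label_language` and `fake_detect` are hypothetical names, and the stand-in detector exists only so the example runs without langdetect installed. In the notebook, `langdetect.detect` would be passed in instead.

```python
import pandas as pd

def label_language(df, detect_fn, column="tweet"):
    """Add a 'lang' column using detect_fn; failures are labelled 'Error'."""
    def safe_detect(text):
        try:
            return detect_fn(str(text))
        except Exception:
            # langdetect raises LangDetectException on undetectable text;
            # a broad except keeps this sketch detector-agnostic
            return "Error"
    out = df.copy()
    out["lang"] = out[column].apply(safe_detect)
    return out

def fake_detect(text):
    """Stand-in detector for illustration only, not a real language model."""
    if not text.strip():
        raise ValueError("no features in text")
    return "es" if "dice" in text else "en"

sample = pd.DataFrame(
    {"tweet": ["Biden speaks in Florida", "Biden dice en Florida", "   "]}
)
labelled = label_language(sample, fake_detect)          # step 1: label
english_only = labelled[labelled["lang"] == "en"]       # step 2: filter
print(english_only)
```

Using `apply` with a guarded function handles both DataFrames the same way and removes the need for the manual loop and merge.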

Reading the existing language-filtered files

In [12]:
joe_biden = pd.read_csv(r'joe_biden.csv')
joe_biden.head()
Out[12]:
Unnamed: 0 created_at tweet_id tweet likes retweet_count source user_id user_name user_screen_name ... user_location lat long city country continent state state_code collected_at lang
0 6 2020-10-15 00:00:25 1.316529e+18 In 2020 NYPost is being censorship CENSORE... 0.0 0.0 Twitter for iPhone 1.994033e+07 Change Illinois | Biden will increase taxes by... changeillinois ... Chicago, Illinois 41.875562 -87.624421 Chicago United States of America North America Illinois IL 2020-10-21 00:00:03.106963698 en
1 17 2020-10-15 00:01:23 1.316530e+18 Comments on this Do Democrats Understand how... 0.0 0.0 Twitter Web App 1.016593e+08 John Ubaldi ubaldireports ... Tampa, Florida 27.947759 -82.458444 Tampa United States of America North America Florida FL 2020-10-21 00:00:08.803063811 en
2 25 2020-10-15 00:01:57 1.316530e+18 RealJamesWoods BidenCrimeFamily JoeBiden H... 0.0 0.0 Twitter for Android 1.300837e+18 Sam KEYS SamKEYS65729181 ... Los Angeles, CA 34.053691 -118.242766 Los Angeles United States of America North America California CA 2020-10-21 00:00:12.945682075 en
3 29 2020-10-15 00:02:06 1.316530e+18 Come on ABC PLEASE DO THE RIGHT THING Move t... 0.0 0.0 Twitter Web App 3.343224e+08 Elphygirl Elphygirl ... New York, NY 40.712728 -74.006015 New York United States of America North America New York NY 2020-10-21 00:00:15.016991207 en
4 34 2020-10-15 00:02:23 1.316530e+18 realDonaldTrump addresses JoeBiden and Hunt... 0.0 1.0 Twitter for iPhone 3.381891e+09 Truth Hurts TheTruthSekr ... Minneapolis, MN 44.977300 -93.265469 Minneapolis United States of America North America Minnesota MN 2020-10-21 00:00:17.606127622 en

5 rows × 23 columns

In [11]:
donard_trump = pd.read_csv('donard_trump_en.csv')
donard_trump.head()
Out[11]:
Unnamed: 0 created_at tweet_id tweet likes retweet_count source user_id user_name user_screen_name ... user_location lat long city country continent state state_code collected_at lang
0 2 2020-10-15 00:00:02 1.316529e+18 Trump As a student I used to hear for years ... 2.0 1.0 Twitter Web App 8.436472e+06 snarke snarke ... Portland 45.520247 -122.674195 Portland United States of America North America Oregon OR 2020-10-21 00:00:00.746433060 en
1 4 2020-10-15 00:00:08 1.316529e+18 You get a tie And you get a tie Trump s ra... 4.0 3.0 Twitter for iPhone 4.741380e+07 Rana Abtar - رنا أبتر Ranaabtar ... Washington DC 38.894992 -77.036558 Washington United States of America North America District of Columbia DC 2020-10-21 00:00:01.492866121 en
2 11 2020-10-15 00:00:25 1.316529e+18 In 2020 NYPost is being censorship CENSORE... 0.0 0.0 Twitter for iPhone 1.994033e+07 Change Illinois | Biden will increase taxes by... changeillinois ... Chicago, Illinois 41.875562 -87.624421 Chicago United States of America North America Illinois IL 2020-10-21 00:00:04.105381834 en
3 12 2020-10-15 00:00:26 1.316529e+18 Trump PresidentTrump Trump2020LandslideVict... 3.0 5.0 Twitter for Android 1.243315e+18 Ron Burgundy Anchorman_USA ... San Diego, CA 32.717421 -117.162771 San Diego United States of America North America California CA 2020-10-21 00:00:04.478598364 en
4 22 2020-10-15 00:01:14 1.316530e+18 Trump Nobody likes to tell you this but som... 1.0 1.0 Twitter Web App 8.436472e+06 snarke snarke ... Portland 45.520247 -122.674195 Portland United States of America North America Oregon OR 2020-10-21 00:00:08.210763668 en

5 rows × 23 columns

Use NLTK's VADER sentiment analyzer to categorize each tweet as positive, negative, or neutral

In [13]:
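from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Requires the VADER lexicon; if it is missing, run nltk.download('vader_lexicon') once
sid = SentimentIntensityAnalyzer()

def rate_tweet(string):
    """Label a tweet POS, NEUTR, or NEG from the sign of its compound score."""
    string = str(string)
    score = sid.polarity_scores(string)
    if score['compound'] > 0:
        return "POS"
    elif score['compound'] == 0:
        return "NEUTR"
    else:
        return "NEG"

def sentiment_score(string):
    """Return VADER's compound score, which ranges from -1 to 1."""
    string = str(string)
    score = sid.polarity_scores(string)
    return score['compound']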
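
The classifier above treats any nonzero compound score as positive or negative. A convention commonly cited in the VADER documentation is a small neutral band of ±0.05 instead. The sketch below shows that alternative; `label_from_compound` and its `threshold` parameter are hypothetical, and the value 0.03 is a made-up score used only to show the difference.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to POS / NEUTR / NEG with a neutral band."""
    if compound >= threshold:
        return "POS"
    if compound <= -threshold:
        return "NEG"
    return "NEUTR"

# -0.4707 and 0.7241 appear in the notebook output; 0.03 is illustrative
scores = [-0.4707, 0.0, 0.03, 0.7241]
labels = [label_from_compound(s) for s in scores]
print(list(zip(scores, labels)))
```

With the notebook's zero cutoff, the 0.03 example would be labelled POS; with the band it is NEUTR. Which cutoff is appropriate depends on how much weak sentiment should count.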

Evaluate the sentiment score of each tweet

In [14]:
joe_biden['sentiment_score'] = joe_biden['tweet'].apply(sentiment_score)
joe_biden['sentiment'] = joe_biden['tweet'].apply(rate_tweet)
joe_biden[['tweet','lang','sentiment','sentiment_score']].head()
Out[14]:
tweet lang sentiment sentiment_score
0 In 2020 NYPost is being censorship CENSORE... en NEG -0.4707
1 Comments on this Do Democrats Understand how... en NEUTR 0.0000
2 RealJamesWoods BidenCrimeFamily JoeBiden H... en NEUTR 0.0000
3 Come on ABC PLEASE DO THE RIGHT THING Move t... en POS 0.7241
4 realDonaldTrump addresses JoeBiden and Hunt... en NEUTR 0.0000

Do the same for the Trump tweets

In [15]:
donard_trump['sentiment_score'] = donard_trump['tweet'].apply(sentiment_score)
donard_trump['sentiment'] = donard_trump['tweet'].apply(rate_tweet)
donard_trump[['tweet','lang','sentiment','sentiment_score']].head()
Out[15]:
tweet lang sentiment sentiment_score
0 Trump As a student I used to hear for years ... en POS 0.3612
1 You get a tie And you get a tie Trump s ra... en NEUTR 0.0000
2 In 2020 NYPost is being censorship CENSORE... en NEG -0.4707
3 Trump PresidentTrump Trump2020LandslideVict... en POS 0.5267
4 Trump Nobody likes to tell you this but som... en POS 0.6956

Breakdown by State

Joe Biden

In [16]:
# Number of tweets for each (state, sentiment) pair
sentimental_summary_by_State = joe_biden[joe_biden.country == 'United States of America'].groupby(['state','sentiment']).size()

# Divide each count by the total number of tweets in that state
sentimental_percentage_summary_by_state = sentimental_summary_by_State / sentimental_summary_by_State.groupby(level=0).transform('sum')

percentage = sentimental_percentage_summary_by_state.to_frame('percentage_of_sentiment')
pd.set_option('display.max_rows', 500)
format_percentage = {'percentage_of_sentiment': '{:.2%}'}
percentage.style.format(format_percentage)
Out[16]:
percentage_of_sentiment
state sentiment
Alabama NEG 27.64%
NEUTR 29.09%
POS 43.27%
Alaska NEG 33.88%
NEUTR 26.45%
POS 39.67%
Arizona NEG 27.95%
NEUTR 30.20%
POS 41.84%
Arkansas NEG 16.67%
NEUTR 37.88%
POS 45.45%
California NEG 24.46%
NEUTR 32.60%
POS 42.94%
Colorado NEG 24.09%
NEUTR 29.84%
POS 46.07%
Connecticut NEG 23.24%
NEUTR 36.62%
POS 40.14%
Delaware NEG 9.52%
NEUTR 20.24%
POS 70.24%
District of Columbia NEG 22.52%
NEUTR 37.80%
POS 39.69%
Florida NEG 21.30%
NEUTR 39.77%
POS 38.92%
Georgia NEG 22.41%
NEUTR 35.68%
POS 41.91%
Hawaii NEG 28.86%
NEUTR 28.86%
POS 42.29%
Idaho NEG 22.98%
NEUTR 35.40%
POS 41.61%
Illinois NEG 24.68%
NEUTR 32.03%
POS 43.29%
Indiana NEG 16.85%
NEUTR 20.55%
POS 62.60%
Iowa NEG 18.38%
NEUTR 36.03%
POS 45.59%
Kansas NEG 25.38%
NEUTR 33.85%
POS 40.77%
Kentucky NEG 21.78%
NEUTR 33.66%
POS 44.55%
Louisiana NEG 20.90%
NEUTR 38.43%
POS 40.67%
Maine NEG 25.00%
NEUTR 62.50%
POS 12.50%
Maryland NEG 24.27%
NEUTR 31.15%
POS 44.58%
Massachusetts NEG 26.73%
NEUTR 34.09%
POS 39.18%
Michigan NEG 24.49%
NEUTR 34.59%
POS 40.92%
Minnesota NEG 28.22%
NEUTR 30.20%
POS 41.58%
Mississippi NEG 46.81%
NEUTR 13.83%
POS 39.36%
Missouri NEG 26.73%
NEUTR 32.12%
POS 41.15%
Montana NEG 36.36%
NEUTR 18.18%
POS 45.45%
Nebraska NEG 22.82%
NEUTR 40.94%
POS 36.24%
Nevada NEG 24.87%
NEUTR 32.06%
POS 43.07%
New Hampshire NEG 21.43%
NEUTR 25.00%
POS 53.57%
New Jersey NEG 30.00%
NEUTR 33.18%
POS 36.82%
New Mexico NEG 18.61%
NEUTR 40.69%
POS 40.69%
New York NEG 26.27%
NEUTR 31.76%
POS 41.96%
North Carolina NEG 22.12%
NEUTR 30.79%
POS 47.09%
North Dakota NEG 41.38%
NEUTR 20.69%
POS 37.93%
Ohio NEG 19.33%
NEUTR 45.18%
POS 35.49%
Oklahoma NEG 17.48%
NEUTR 35.66%
POS 46.85%
Oregon NEG 28.70%
NEUTR 27.31%
POS 43.99%
Pennsylvania NEG 24.55%
NEUTR 32.17%
POS 43.28%
Puerto Rico NEG 17.12%
NEUTR 39.64%
POS 43.24%
Rhode Island NEG 20.00%
NEUTR 45.71%
POS 34.29%
South Carolina NEG 30.99%
NEUTR 32.75%
POS 36.27%
South Dakota NEG 20.00%
NEUTR 20.00%
POS 60.00%
Tennessee NEG 23.53%
NEUTR 35.80%
POS 40.67%
Texas NEG 26.50%
NEUTR 32.47%
POS 41.02%
Utah NEG 10.06%
NEUTR 74.30%
POS 15.63%
Vermont NEG 14.29%
NEUTR 40.00%
POS 45.71%
Virginia NEG 26.43%
NEUTR 30.40%
POS 43.17%
Washington NEG 25.41%
NEUTR 32.97%
POS 41.62%
West Virginia NEG 17.39%
NEUTR 26.09%
POS 56.52%
Wisconsin NEG 21.74%
NEUTR 33.39%
POS 44.87%
Wyoming NEG 21.05%
NEUTR 15.79%
POS 63.16%
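
A breakdown like the one above can also be produced with `pd.crosstab`, which makes it easy to keep the tweet count next to the shares. This is a sketch with made-up rows; `sentiment_share_by_state` and the `n_tweets` column are hypothetical names, not part of the notebook.

```python
import pandas as pd

def sentiment_share_by_state(df):
    """Return per-state sentiment shares plus the number of tweets per state."""
    counts = pd.crosstab(df["state"], df["sentiment"])
    # normalize="index" divides each row by its own total
    shares = pd.crosstab(df["state"], df["sentiment"], normalize="index")
    shares["n_tweets"] = counts.sum(axis=1)
    return shares

# Made-up sample: four tweets from one state, a single tweet from another
sample = pd.DataFrame({
    "state": ["Ohio", "Ohio", "Ohio", "Ohio", "Utah"],
    "sentiment": ["POS", "NEG", "POS", "NEUTR", "NEUTR"],
})
shares = sentiment_share_by_state(sample)
print(shares)
```

Keeping `n_tweets` in view helps when reading extreme percentages, since a state with one tweet will always show 100% in a single category.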

Trump

In [17]:
# Number of tweets for each (state, sentiment) pair
donard_trump_sentimental_summary_by_State = donard_trump[donard_trump.country == 'United States of America'].groupby(['state','sentiment']).size()

# Divide each count by the total number of tweets in that state
donard_trump_sentimental_percentage_summary_by_state = donard_trump_sentimental_summary_by_State / donard_trump_sentimental_summary_by_State.groupby(level=0).transform('sum')

donard_trump_percentage = donard_trump_sentimental_percentage_summary_by_state.to_frame('percentage_of_sentiment')
pd.set_option('display.max_rows', 500)
format_percentage = {'percentage_of_sentiment': '{:.2%}'}
donard_trump_percentage.style.format(format_percentage)
Out[17]:
percentage_of_sentiment
state sentiment
Alabama NEG 30.30%
NEUTR 18.18%
POS 51.52%
Alaska NEG 25.00%
NEUTR 50.00%
POS 25.00%
Arizona NEG 38.06%
NEUTR 26.12%
POS 35.82%
Arkansas NEG 28.57%
NEUTR 28.57%
POS 42.86%
California NEG 36.54%
NEUTR 22.95%
POS 40.52%
Colorado NEG 48.31%
NEUTR 22.46%
POS 29.24%
Connecticut NEG 27.27%
NEUTR 18.18%
POS 54.55%
District of Columbia NEG 40.02%
NEUTR 23.34%
POS 36.64%
Florida NEG 42.71%
NEUTR 19.90%
POS 37.39%
Georgia NEG 35.82%
NEUTR 28.36%
POS 35.82%
Hawaii NEG 39.29%
NEUTR 32.14%
POS 28.57%
Idaho NEG 46.15%
NEUTR 2.56%
POS 51.28%
Illinois NEG 46.04%
NEUTR 19.43%
POS 34.53%
Indiana NEG 37.50%
NEUTR 20.83%
POS 41.67%
Iowa NEG 50.00%
NEUTR 18.75%
POS 31.25%
Kansas NEG 23.08%
NEUTR 38.46%
POS 38.46%
Kentucky NEG 32.43%
NEUTR 21.62%
POS 45.95%
Louisiana NEG 40.00%
NEUTR 25.00%
POS 35.00%
Maine NEG 33.33%
NEUTR 66.67%
Maryland NEG 38.27%
NEUTR 23.46%
POS 38.27%
Massachusetts NEG 52.09%
NEUTR 17.67%
POS 30.23%
Michigan NEG 31.52%
NEUTR 15.76%
POS 52.73%
Minnesota NEG 32.58%
NEUTR 11.24%
POS 56.18%
Mississippi NEG 100.00%
Missouri NEG 40.74%
NEUTR 20.99%
POS 38.27%
Montana NEG 33.33%
NEUTR 16.67%
POS 50.00%
Nebraska NEG 29.41%
NEUTR 17.65%
POS 52.94%
Nevada NEG 35.19%
NEUTR 32.10%
POS 32.72%
New Hampshire NEG 50.00%
POS 50.00%
New Jersey NEG 45.95%
NEUTR 18.92%
POS 35.14%
New Mexico NEG 40.00%
NEUTR 30.00%
POS 30.00%
New York NEG 46.74%
NEUTR 19.76%
POS 33.50%
North Carolina NEG 43.92%
NEUTR 24.32%
POS 31.76%
Ohio NEG 36.75%
NEUTR 19.88%
POS 43.37%
Oklahoma NEG 21.43%
NEUTR 35.71%
POS 42.86%
Oregon NEG 41.72%
NEUTR 11.03%
POS 47.24%
Pennsylvania NEG 39.94%
NEUTR 23.27%
POS 36.79%
Puerto Rico NEG 54.55%
POS 45.45%
Rhode Island NEG 33.33%
NEUTR 16.67%
POS 50.00%
South Carolina NEG 42.11%
NEUTR 31.58%
POS 26.32%
Tennessee NEG 36.08%
NEUTR 20.25%
POS 43.67%
Texas NEG 32.33%
NEUTR 28.29%
POS 39.38%
Utah NEG 48.00%
NEUTR 32.00%
POS 20.00%
Vermont NEG 42.86%
NEUTR 57.14%
Virginia NEG 48.89%
NEUTR 10.00%
POS 41.11%
Washington NEG 46.22%
NEUTR 17.78%
POS 36.00%
West Virginia NEUTR 50.00%
POS 50.00%
Wisconsin NEG 38.89%
NEUTR 35.19%
POS 25.93%
Wyoming NEUTR 100.00%

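These tables show the percentage of positive, negative, and neutral tweets for each state. States where a single category reaches 100%, or where a category is missing entirely, most likely have very few tweets, so their percentages should be read with caution.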

Let's visualize the sentiment score on a map

In [18]:
import plotly.graph_objects as go
In [19]:
# Mean sentiment score per state, scaled by 100 for readability
state_scale = joe_biden[joe_biden.country == 'United States of America'].groupby('state_code')['sentiment_score'].mean() * 100
state_scale = state_scale.reset_index()

# Hover text needs one entry per state so that it lines up with `locations`
state_scale['text'] = state_scale['state_code'].astype(str) + '<br>' + \
            'Sentiment Score: ' + state_scale['sentiment_score'].round(2).astype(str)

fig = go.Figure(data=go.Choropleth(
    locations=state_scale['state_code'], # Two-letter state codes
    z = state_scale['sentiment_score'].astype(float), # Data to be color-coded
    text = state_scale['text'],
    locationmode = 'USA-states', # set of locations match entries in `locations`
    colorscale = 'Blues',
    colorbar_title = "sentiment scale",
))

fig.update_layout(
    title_text = 'Joe Biden Sentiment Scale',
    geo_scope='usa', # limit map scope to USA
)

fig.show()

Trump sentiment analysis

In [20]:
# Mean sentiment score per state, scaled by 100 for readability
state_scale = donard_trump[donard_trump.country == 'United States of America'].groupby('state_code')['sentiment_score'].mean() * 100
state_scale = state_scale.reset_index()

fig = go.Figure(data=go.Choropleth(
    locations=state_scale['state_code'], # Two-letter state codes
    z = state_scale['sentiment_score'].astype(float), # Data to be color-coded
    locationmode = 'USA-states', # set of locations match entries in `locations`
    colorscale = 'Reds',
    colorbar_title = "sentiment scale",
))

fig.update_layout(
    title_text = 'Donald Trump Sentiment Scale',
    geo_scope='usa', # limit map scope to USA
)

fig.show()

Overall

In [21]:
# Sum of compound scores over all tweets for each candidate.
# Note: a sum grows with the number of tweets, so it reflects volume as well as tone.
all_joe_biden_rate = joe_biden.sentiment_score.sum()
all_donard_trump_rate = donard_trump.sentiment_score.sum()

import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
plt.style.use('default')

x = ['Donald Trump']
x2 = ['Joe Biden']
# Scale each sum by a fixed constant (10,000) and express it as a percentage
biden_rating = [(float(all_joe_biden_rate)/10000)*100]
Donard_rating = [(float(all_donard_trump_rate)/10000)*100]

x_pos = [i for i, _ in enumerate(np.concatenate((x, x2)))]

plt.bar(x, Donard_rating, color='red', label='Trump Rating',
        width=0.4)
plt.bar(x2, biden_rating, color='blue', label='Biden Rating',
        width=0.4)

plt.xlabel("Presidential Candidate")
plt.ylabel("Sentiment Score")
plt.title("Sentiment Analysis for Each Presidential Candidate")
plt.legend(loc="upper left")

plt.xticks(x_pos, np.concatenate((x, x2)))
plt.gca().yaxis.set_major_formatter(mtick.PercentFormatter())
plt.show()
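
The chart above compares summed scores, which depend on how many tweets each candidate has. As a hedged alternative, the sketch below compares mean scores, which are independent of tweet volume. `mean_sentiment` is a hypothetical helper and the scores are made up; this is not the notebook's method.

```python
import pandas as pd

def mean_sentiment(df, column="sentiment_score"):
    """Average compound score per tweet, independent of tweet volume."""
    return float(df[column].mean())

# Made-up scores: four tweets for one candidate, two for the other
biden_sample = pd.DataFrame({"sentiment_score": [0.5, -0.25, 0.0, 0.75]})
trump_sample = pd.DataFrame({"sentiment_score": [0.5, -0.5]})

summary = pd.Series({
    "Joe Biden": mean_sentiment(biden_sample),
    "Donald Trump": mean_sentiment(trump_sample),
})
print(summary)
```

The resulting Series could be passed to the same bar chart in place of the scaled sums; the y-axis would then read as an average compound score between -1 and 1 rather than a percentage.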